Towards a microscopic description of the free-energy landscape of water 
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Free-energy landscape theory is often used to describe complex molecular systems. Here, a mi- 
croscopic description of water structure and dynamics based on configuration-space-networks and 
molecular dynamics simulations of the TIP4P/2005 model is applied to investigate the free-energy 
landscape of water. The latter is built on top of a large set of water microstates describing the kinetic 
stability of local hydrogen-bond arrangements up to the second solvation shell. In temperature space, 
the landscape displays three regions with an overall different organization. At ambient conditions, 
the free-energy surface is characterized by structural inhomogeneities with multiple, structurally 
well defined, short-lived basins of attraction. Below around ambient temperature, the liquid rapidly 
becomes homogeneous. In this regime, the landscape is funneled-like, with fully-coordinated water 
arrangements at the bottom of the funnel. Finally, a third region develops below the temperature of 
maximal compressibility (Widom line) where the funnel becomes steeper with few interconversions 
between microstates other than the fully coordinated ones. Our results present a viable a way to 
manage the complexity of water structure and dynamics, connecting microscopic properties to its 
ensemble behavior. 



INTRODUCTION 

Water is the most studied liquid in nature. Notwith- 
standing the effort, the debate is still open when it comes 
to the connection between atomic properties and its en- 
semble behavior. As an example, the presence of a den- 
sity maximum at a temperature of 4 degrees Celsius [T] is 
qualitatively interpreted as the result of two competing 
driving forces, i.e., the directionality of hydrogen bonds, 
favoring tetrahedral (lower-density) water arrangements 
which are enthalpically stabilized, and entropy maxi- 
mization by non-directional interactions and disorder, re- 
sulting in a closer packing (higher-density). Recent ex- 
perimental and computational evidences pointed out that 
at ambient conditions these arrangements have distinct 
structural and dynamical properties [2;. ,4j . These findings 
might be related to a century-old idea describing water 
anomalies as an emergent property of a mixture-like liq- 
uid [5'48j. This picture is far from conclusive, with other 
groups providing counter evidences to these ideas [9HTT] . 

In other fields, a useful approach to clarify the con- 
nection between microscopic properties and ensemble be- 
havior makes use of the free-energy paradigm. Energy 
landscape theory have demonstrated to be a successful 
approach for the study of the structure and dynamics of 
complex molecular systems [r^HTO] . Within this frame- 
work, molecular dynamics is interpreted as a trajectory 
on the multidimensional free-energy surface. The lat- 
ter is made up of many valleys connected by saddles, 
suggesting that system dynamics can be divided into in- 
travalley and intervalley motions |17| . The former rep- 
resent the oscillations around local minima, while the 
latter involve barrier crossings from one minimum to an- 
other ^8j. Most descriptions of the energy surface of 
water have been limited to ensemble properties or clus- 
ters of a few tens of molecules P^M^Tj . Recently, a map- 



ping of the free-energy surface of bulk water from a more 
microscopic perspective, an extension of configuration- 
space- networks [32 [23] , was proposed |3] • Network con- 
figurations (i.e. nodes) and links represented local wa- 
ter hydrogen-bond arrangements with an extension of 
two solvation shells and the transitions among them as 
sampled by molecular dynamics simulations, respectively. 
This approach represents a reductionist strategy (i.e., fo- 
cusing on the microscopic behavior) to describe the free- 
energy landscape of water. 

In this work, network analysis is extended beyond am- 
bient temperature by running extensive molecular dy- 
namics simulations of the TIP4P /2005 water model from 
340K to the supercooled regime. As the system is cooled 
down, the free-energy landscape is characterized by three 
different regimes. At temperatures up to the physio- 
logical one, the free-energy presents several, short-lived, 
basins of attraction. Below this temperature, the liquid 
becomes homogeneous with the development of a funnel- 
like energy landscape. Finally, by passing the tempera- 
ture of maximum compressibility the funnel rapidly be- 
comes more pronounced with a very strong bias towards 
fully coordinated structures. 

Within the context of water structure and dynamics, 
our results provide a reductionist approach linking mi- 
croscopic behavior to water ensemble properties. 

METHODS 

Molecular dynamics simulations. All molecu- 
lar dynamics simulations were run with the program 
GROMACS |241 |25J. The water box consisted of 1024 
TIP4P/2005 [26] water molecules in the NPT ensemble 
with pressure of 1 atm and temperatures ranging from 
190 K to 340 K with steps of 10 K. The integration time- 
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step was 2 fs, saving coordinates every 4 fs. Equilibration 
was run for 15 ns, followed by a 2 ns long run to collect 
the data. For temperatures below 240K the equilibration 
time was elongated up to 25 ns. During the production 
runs, the system was coupled to a Berendsen barostat 
[?7] and a velocity rescale thermostat [55] with coupling 
times Tp — 1.0 ps and tt — 1.0 ps, respectively. Long 
range electrostatics was computed with PME [29], using 
a 1.0 nm cutoff for all non-bonded interactions. Calcula- 
tions with the TIP3P [SIT and TIP5P [S] models were 
carried out using the same protocol. 

Hydrogen bond definitions. The hydrogen bond 
definition introduced by Skinner and collaborators was 
used [35]. In this approach the orbital occupancy is 
parametrized in terms of the intermolecular distance r 
between the acceptor O and donor H and the angle 
that the O • • • H ray makes with the out-of-plane unit 
vector of the acceptor molecule using the formula: 

iV(r,V') = exp(-r/0.343)(7.1-0.050V' + 0.00021V'^) (1) 

As in the original work, two molecules are considered to 
be hydrogen-bonded if the occupancy iV(r, ij)) exceeds a 
certain threshold {N > 0.0085). To check the robust- 
ness of our results with the hydrogen bond definition, 
the analysis was repeated considering the conventional 
inter-oxygen distance and angle 0-H-O (cutoffs of 3.5 A 
and 30 degrees, respectively f33|). 

Tetrahedral order parameter. The tetrahedral or- 
der parameter for a water molecule i is calculated as 



j=i k=j+i 
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where ipjik is the angle formed by their oxygens [6]. The 
averaged value of this order parameter over an ensemble 
of water molecules is denoted as Q. 



THEORY 

Configuration-space-networks (often referred as 
Markov-State-Models) provide high resolution free- 
energy landscape representations of complex molecular 
systems [H |^ [H [MHSHj- The main idea behind 
this approach is to map the molecular dynamics onto 
a transition network where nodes and links represent 
system configurations and the transitions between them, 
respectively. System configurations are microstates, i.e. 
very small portions of the accessible configuration space. 
Within this approach a variety of molecular trajectories 
from molecular dynamics simulations [TH] [22l [23] to 
single molecule experiments [39ti4T] can be analyzed. 
The great advantage of configuration-space-networks 
with respect to conventional methods is that free- 
energy representations are obtained without the use of 
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FIG. 1. Representative water microstates belonging to the 
four most populated gradient-clusters at 300K. Hydrogen 
bonds are represented as dashed lines. For simplicity of ref- 
erence, each of the four configurations is classified by two 
numbers, indicating the number of donors and acceptors of 
the central water molecule (e.g. 21 stands for two donors and 
one acceptor). 



projections onto arbitrarily chosen order parameters 
[211 [34]. 

Very recently, we showed that configuration-space- 
networks can be used to map the free-energy landscape 
of bulk water at ambient conditions 3 . In this case, 
microstates are defined by hydrogen-bond connectivity 
patterns including the first and second solvation shells 
of a given water molecule (see Fig. [T]). The character- 
ization of the hydrogen-bond pattern is built in a way 
that takes into account of the indistinguishable nature of 
water molecules and all the possible permutations (see 
Ref. [3] for details). Consequently, the configuration- 
space-network of water is built by following the evolution 
of the microstates of each water molecule of the simula- 
tion box, resulting in 1024 trajectories of length 2 ns for 
the present case. Network links are direct molecular tran- 
sitions between water microstates of the same molecule 
at a temporal resolution imposed by the saving frequency 
of Cartesian coordinates (4 fs in the present work). By 
recording link populations, the final configuration-space- 
network is equivalent to a transition matrix. 

As observed elsewhere [TH] (53] [31] , the transition net- 
work resulting from this analysis synthetically encodes 
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FIG. 2. Configuration-space- networks. Pictorial represen- 
tation of the relative balance between intra-basin (Zin) and 
inter-basins (Zout) transition probabilities from the point of 
view of a node (in blue). Gray regions represent free-energy 
basins of attraction as detected by the gradient-algorithm 
[381 [43] (see Theory for details). 

the complex organization of the underlying free-energy 
landscape. Specifically, densely connected regions of the 
network correspond to free-energy basins, i.e., metastable 
regions of the configuration space. Several algorithms 
can be used to extract this information, including the 
max flow theorem [34] , random walks [23l |42j or transi- 
tion gradient analysis [38l [43] . All these approaches aim 
to clusterize the network into kinetically and structurally 
well defined basins of attraction. 

In enthalpy driven free-energy landscapes, of which 
proteins are an archetypal example, the transition prob- 
ability to stay inside a given basin Zm is much larger 
then the probability to hop outside Zo^t [ISl IMj ■ That 
is, basin hoping is a rare event. Moreover, the number of 
neighboring basins is usually very limited, with the emer- 
gence of well defined transition pathways [HJ [52 155] . 
This is not the case for water [3]. Being a liquid, it 
is mainly characterized by entropic basins of attraction. 
As illustrated in Fig. [2j Zi^ and Z^ut become compara- 
ble because the cumulative of the many small inter-basin 
transition probabilities (Zout) is similar to the few highly 
populated intra-basin relax:ations (Zin). In other words, 
the probability to leave the basin is similar to stay in it. 
This observation would lead to the conclusion that, at 
the atomic level, water does not have any type of con- 
figurational selection. However, this is not true when 
considering all the contributions to Zout separately: 

Zout = ^ ^oti't (3) 
i 

Structural inhomogeneities, i.e., configurational selec- 
tion, emerge because 

max (4:^) » max (z« ) , (4) 



meaning that the probability of an intra-basin transition 
is larger than hoping to any other specific basin. When 
this condition holds, the environment of a given water 
molecule alternatively adopts a number of different con- 
figurations, each of them characterized by a specific free- 
energy basin of attraction. This is an emergent property 
of water at ambient temperature [3]. 

Such structural inhomogeneities are found by using a 
gradient-based approach to cluster the transition net- 
work, grouping together water microstates belonging to 
the same steepest descent free-energy pathway [51 [5Sl[i5] . 
From an operative point of view, the algorithm works on 
a per-node basis by deleting all the links (transitions) but 
the most visited one (which represents the local direction 
of the gradient). When applied to the whole network, 
the algorithm provides a set of disconnected trees, each 
of them representing a collective pathway of relaxation 
to the bottom of the local free-energy basin of attraction 
(gradient-cluster, gray regions in Fig. [5]). Each gradient- 
cluster represents a structurally and kinetically well de- 
fined molecular arrangement with an extension of up to 
two solvation shells [3]. 

In this contribution, the behavior of these free-energy 
basins is investigated in temperature space to elucidate 
the global organization of the free-energy landscape, in- 
cluding the relationship between microscopic behavior 
and ensemble properties. 

RESULTS AND DISCUSSION 
The free-energy landscape of water 

Molecular dynamics simulations of the TIP4P/2005 
water model at temperatures from 190K to 340K 
were run and their corresponding configuration-space- 
networks built. For each of these networks, we looked 
for the free-energy basins characterizing local water ar- 
rangements by means of a gradient-cluster analysis (see 
Theory) [1 [^ [33]. In Fig. [sj the population of the 
most visited gradient-clusters is shown as a function of 
temperature. At temperatures larger than 285K, several 
free-energy basins of attraction are found in agreement 
with previous analysis on the SPC model at 300K [3]. 
The structural configurations at the bottom of the most 
visited free-energy basins are pictorially represented in 
Fig. [l] They correspond to the following hydrogen-bond 
configurations of the central water molecule: 2 donors, 
1 acceptor (21, dark blue in Fig. [3j population of 0.32 
at 300K); 2 donors, 2 acceptors (22, fight blue, 0.21); 
1 donor, 2 acceptors (12, red, 0.13); donors, 2 accep- 
tors (02, yellow, 0.01). At the highest temperatures a 
fifth basin appears being characterized by a 11 first sol- 
vation shell (gray). This acceptor/donor representation 
is adopted for simplicity but the contribution of the sec- 
ond solvation shell organization is strictly needed when 
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FIG. 3. Population of the most visited gradient-clusters as a 
function of temperature for the TIP4P/2005 model. Vertical 
lines correspond to 225K and 285K. Tm indicates the melting 
temperature of the model at around 250K I26| . 



it comes to correctly characterize the free-energy basins 
(e.g. there are basins of attraction with the same first 
shell but different second shell [3]). 

In this temperature range the liquid is inhomoge- 
neous in the sense that the local environment of a water 
molecule interconverts between configurations with dis- 
tinct structural and dynamical properties. Those rep- 
resent short-lived metastable arrangements with sub-ps 
lifetime [311]. 

Below 285K this property is lost as shown by the rapid 
increase of the population of the 22 gradient-cluster to a 
value larger than 0.8 (light blue in Fig. [s]). As such, all 
highly populated gradient-clusters collapse to 22, being 
the only largely populated free-energy basin. The popu- 
lation of this basin is almost constant till 225K. In this 
regime, the liquid is homogeneous and the free-energy 
landscape resembles a funnel, with the fully-coordinated 
configuration 22 at the bottom of it (see also Section 
C). The funnel behavior emerges because 22 becomes a 
global attractor of the dynamics as it is the case for the 
native state in protein folding [33]. Still, a cumulative 
population of 0.2 split into roughly six basins survives. 
These configurations are rich of 4-fold hydrogen-bond 
loops, slowly interconverting with the fully coordinated 
configuration. 

Below 225K, i.e. roughly below the temperature of 
maximum compressibility (estimated to be around 230K 
[45]). the entire landscape collapses onto 22 with a much 
more pronounced funnel behavior (see also Section C). 
Interestingly, the temperature of maximum compressibil- 
ity is considered by some as the Widom line, i.e. the prop- 
agation of a liquid-liquid critical point located at higher 




FIG. 4. Distribution of the tetrahedral order parameter Q for 
the three regimes. 



pressure [8ll35ll36]. If this is so, this water regime would 
be connected to the mentioned transition. In this tem- 
perature region the density assumes its minimum value 
(see Fig. [sji) . For this reason, we refer to this tempera- 
ture segment as the low-density homogeneous regime of 
the liquid. 

It is interesting to compare these regimes with the dis- 
tribution of the average tetrahedral order parameter Q 
(see Methods) [B]. In Fig. [I] data for 320K, 240K and 
190K is shown. At around ambient conditions the dis- 
tribution is bi-modal (light gray), indicating that the 
liquid assumes both ordered and disordered atomic ar- 
rangements. This property is lost at lower temperatures 
where the distribution becomes uni-modal (gray) with 
a small population for values close to 0.5. This sub- 
population disappears below the temperature of maxi- 
mum compressibility, resulting in a sharply peaked dis- 
tribution. The shape shift from bi-modal to uni-modal is 
in good agreement with the change from inhomogeneous 
regime to the homogeneous one. 

Summarizing this section, three different regimes for 
the liquid phase of water were found. Each of these re- 
gions is characterized by a specific organization of the 
underlying free-energy landscape as shown by the tem- 
perature dependence of the populations of the major free- 
energy basins (Fig. [3]). 



Network properties and relation to the density 
anomaly 

As a function of temperature, the number of visited 
microstates (i.e. nodes) is increasing monotonously, as 
shown by Fig. [5^. That is, the higher the temperature 
the larger the portion of the configuration space visited 
by the molecular dynamics simulation. Above 225K the 
relation is linear but below this temperature the number 
of microstates changes in a non-linear way, visiting in 
proportion a smaller fraction of the configuration space. 
This behavior might be related to the breakdown of the 
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FIG. 5. Topology of the configuration-space-network as a 
function of temperature, (a) Number of nodes; (b) Num- 
ber of gradient-clusters with a population larger than 0.01; 
(c) Number of connections of the 22 node. For comparison, 
the average number of connections per node d is shown as 
a dashed line (in this case the multiplicative factor is 1 and 
not xlO'^); (d) density. Vertical lines correspond to 225K and 
285K. 



Einstein diffusion relationship below the temperature of 
maximum compressibility as observed for the ST2 model 

EZ!- 

In Fig.[5j3, the number of gradient-clusters with a pop- 
ulation larger than 0.01 as a function of temperature is 
shown. The data presents a step wise behavior, corre- 
lating very well with the presence of the three regimes. 
Interestingly, the number of gradient-clusters is mostly 
constant in the two low-temperature regimes with only 
one free-energy basin below 225K. 

From a network topology point of view, the number of 
connections per node (degree) grows with temperature, 
going from an average value d of 14.47 to 29.55 at 190K 
and 340K, respectively (dashed line in Fig. [5];). This is 
not the case for the node degree of the microstate 22. As 
shown in Fig. [5];, the degree increases up to around 285K. 
Then it starts to decrease where the liquid changes from 
the homogeneous regime to the inhomogeneous one. 

Comparison with density (Fig. [sji) shows a remarkable 
correlation. With a Pearson coefficient of 0.98, the be- 
havior of the node degree of microstate 22 correlates with 
the density anomaly. This seems an interesting fact con- 
necting an ensemble property like the density to a purely 
microscopic quantity, i.e. the number of accessible tran- 
sitions from the fully coordinated configuration 22. 



The origin of a funneled energy landscape 

In this section, the transition network modifications 
corresponding to the three regimes are illustrated. In the 
higher temperatures regime, structural inhomogeneities 



emerge because the maximum of the transition prob- 
ability max (Z^^')) points towards the attractor of the 
basin (e.g. Z in the pictorial representation of Fig. |6]i). 
This is not the case below 285K where the transition to 
22 (2^22) becomes the maximum of the transition prob- 
ability for many nodes acting as attractors at higher 
temperatures. Fig. shows the temperature at which 
Z22 = max (Z*^*)) for the relevant microstates 21 and 12 
(dark gray bars). Relaxing directly to 22, they do not 
build basins of attraction anymore. For nodes not di- 
rectly connected to 22 the relaxation process to it goes 
through two or more steps like for 02. Consequently, a 
free-energy landscape characterized by a single predom- 
inant minimum (22) develops. This type of landscapes 
recall the well-known funnel-landscape paradigm applied 
to protein folding pilHSlHS] . 

Below 225K, Z22 drives the dynamics in a even stronger 
way being the corresponding transition larger than the 
cumulative of all other transitions (gray bars in Fig. |6^) . 
In other words, every time a water molecule assumes a 
configuration different from 22, the probability to go back 
to 22 is larger with respect to the cumulative of any other 
transition. 

From a qualitative point of view the three regimes of 
the free-energy landscape are represented in Fig. ^jp (in 
panel c a pictorial representation of the underlying net- 
work) . 



Robustness and limitations of the current 
methodology 

Other water models present the three free-energy land- 
scape regimes as TIP4P/2005. In Fig.[7]i, the population 
of the most visited gradient-clusters for the TIP3P and 
TIP5P models are shown. The two models are in quali- 
tative agreement with the analysis done for TIP4P/2005 
where the exact temperature values characterizing the 
three regimes depend on the specific water model used. 
This is expected due to the recently observed model de- 
pendent hydrogen-bond temperature shifts |50] . That 
is, while hydrogen-bond patterns between different wa- 
ter models are essentially the same, they differ by a 
temperature shift of up to ~ 65K for the TIP3P model 
when comparing conventional classical water models to 
TIP4P/2005. For example, the appearance of the homo- 
geneous regime is found at 225K for TIP3P, at a roughly 
60K lower temperature than TIP4P/2005 in excellent 
agreement with previous analysis |50j . Similarly, the case 
of TIP5P is in qualitative agreement with TIP4P/2005. 

The origin of the temperature shift is related to the 
relative hydrogen-bond strength differences of the vari- 
ous models [5^ . Consequently, it is expected that artifi- 
cial modifications of the hydrogen-bond strength due to 
more (or less) conservative bond definitions might shift 
the three regimes as well. This is so because water mi- 
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FIG. 6. Schematic representations of the three regimes of liquid water; (a) temperatures at which there is a change in the 
transition probabiUty maximum for the 21 and 12 microstates (see Results for details); (b) free-energy landscape representation; 
(c) network representation. 



crostates are based on hydrogen-bond connectivity and 
its propensity. To check this behavior, the whole analy- 
sis was repeated by using another definition of hydrogen- 
bond based on the classical inter-oxygen distance and 
donor-acceptor angle R9 (see Methods). As shown in 
Fig. [7|d, the overall behavior of the gradient-cluster pop- 
ulations is remarkably similar with the presence of the 
three liquid regimes. On the other hand, the expected 
temperature shift is present. We found that R6 pre- 
dicts a larger number of hydrogen-bonds than the Skinner 
definition. For the former definition, 3.8 and 3.6 aver- 
age number of hydrogen bonds per molecule are found 
at temperatures of 250K and BOOK, respectively. These 
numbers decrease to 3.7 and 3.3 when the Skinner defini- 
tion is used. Using a less conservative definition like R9, 
effectively increases hydrogen-bond strength. As a conse- 
quence, the population of the fully-coordinated gradient- 
cluster is over estimated, giving in turn a temperature 



shift. Being the discussion on the quality of hydrogen- 
bond definitions still open |321 151| , we want to remark 
that the change of the hydrogen-bond definition would 
only slightly affect the exact position of the three regimes 
but not the existence of them. 



CONCLUSIONS 

We have presented extensive molecular dynamics sim- 
ulations of the TIP4P /2005 model for a temperature in- 
terval ranging from 190K to 340K in conjunction with 
complex network analysis to obtain a detailed mapping of 
the underlying free-energy landscape. The main idea was 
to investigate the structural and dynamical properties of 
local water configurations defined by the hydrogen-bond 
connectivity with an extension of two solvation shells. 

From a microscopic point of view, the free-energy land- 
scape of liquid water is characterized by three major 
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FIG. 7. Robustness of the gradient-cluster analysis, (a) 
Gradient-cluster populations for TIP3P and TIP5P water 
models and (b) for TIP4P/2005 by using the RO hydrogen- 
bond definition. 



regimes. At ambient conditions, several metastable water 
configurations with distinct structure and dynamics are 
found [inhomogeneous regime). Below 285K, the free- 
energy landscape develops a funnel dominated by the 
fully coordinated configuration with an extension of at 
least two solvation shells (homogeneous regime). By low- 
ering the temperature below 225K, the funnel becomes 
more pronounced, with the fully-coordinated configura- 
tion becoming a global attractor of the dynamics (homo- 
geneous low-density regime). 

While the three regimes were deducted from water mi- 
croscopic properties, the presence of the tree regimes is 
correlated to the behavior of the density, which is an 
ensemble property of the system. As such, the homo- 
geneous low-density regime spans till the density start 
to grow with a change in concavity at 225K; the homo- 
geneous regime is characterized by the monotonous in- 
crease of the density curve up to the density maximum 
at around 280K; finally, the descending section of the 
density is located into the inhomogeneous regime. 

From an experimental point of view, the presence of 
structural inhomogeneities at ambient temperature is in 
qualitative agreement with small-angle X-ray scattering 
measurements [2] while the presence of multiple kinetics 
is in principle accessible to high order non-linear spec- 
troscopy II]. 
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